3 research outputs found

    Biomedical Relationship Extraction from Literature Based on Bio-Semantic Token Subsequences

    Relationship Extraction (RE) from biomedical literature is an important and challenging problem in both text mining and bioinformatics. Although various approaches have been proposed to extract protein-protein interaction types, their accuracy leaves considerable room for improvement. In this paper, two supervised learning algorithms based on newly defined bio-semantic token subsequences are proposed for multi-class biomedical relationship classification. The first approach calculates a bio-semantic token subsequence kernel, whereas the second explicitly extracts weighted features from bio-semantic token subsequences. The two proposed approaches outperform several alternatives reported in the literature on multi-class protein-protein interaction classification.
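
As a rough illustration of the kernel idea in the abstract above, the sketch below computes a generic gap-weighted token subsequence kernel through an explicit feature map. It is not the paper's bio-semantic kernel: the `PROTEIN` placeholder token, the subsequence length `p`, the `decay` factor, and both function names are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def subsequence_features(tokens, p=2, decay=0.5):
    """Map a token list to weighted (possibly gapped) subsequences of length p.

    Each occurrence contributes decay ** span, where span is the number of
    positions it covers, so widely spread occurrences count less.
    """
    feats = defaultdict(float)
    for idx in combinations(range(len(tokens)), p):
        span = idx[-1] - idx[0] + 1
        feats[tuple(tokens[i] for i in idx)] += decay ** span
    return feats


def subsequence_kernel(a, b, p=2, decay=0.5):
    """Kernel value: dot product of the two explicit feature maps."""
    fa = subsequence_features(a, p, decay)
    fb = subsequence_features(b, p, decay)
    return sum(w * fb[k] for k, w in fa.items() if k in fb)


# Hypothetical sentences with protein mentions replaced by a class token.
s1 = ["PROTEIN", "binds", "PROTEIN"]
s2 = ["PROTEIN", "activates", "PROTEIN"]
print(subsequence_kernel(s1, s2))
```

The explicit feature map is exponential in `p` and only practical for short sequences; it is used here because it keeps the relationship between the kernel and the weighted-feature view easy to see.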

    Examining Granular Computing from a Modeling Perspective

    In this paper, we use a set of unified components to conduct granular modeling for problem-solving paradigms in several fields of computing. Each identified component may represent a potential research direction in the field of granular computing. A granular computing model for information analysis is proposed. The model may suggest that granular computing is an instrument for implementing perception-based computing on the basis of numeric computing. In addition, a novel granular language modeling technique is proposed for information extraction from web pages. This paper also suggests that the study of data mining in the framework of granular computing may address the issues of interpretability and usage of discovered patterns.

    Hypotheses Generation as Supervised Link Discovery with Automated Class Labeling on Large-scale Biomedical Concept Networks

    Computational approaches to generate hypotheses from biomedical literature have been studied intensively in recent years. Nevertheless, it remains a challenge to automatically discover novel, cross-silo biomedical hypotheses from large-scale literature repositories. To address this challenge, we first model a biomedical literature repository as a comprehensive network of biomedical concepts and formulate hypotheses generation as a process of link discovery on the concept network. We extract the relevant information from the biomedical literature corpus and generate a concept network and a concept-author map on a cluster using the Map-Reduce framework. We extract a set of heterogeneous features such as random-walk-based features, neighborhood features and common-author features. Because the number of candidate links in our concept network is large, the features are extracted on a cluster using the Map-Reduce framework to address the scalability problem. We further model link discovery as a classification problem carried out on a training data set automatically extracted from two network snapshots taken over two consecutive time periods. A set of heterogeneous features, covering both topological and semantic features derived from the concept network, has been studied with respect to its impact on the accuracy of the proposed supervised link discovery process. A case study of hypotheses generation based on the proposed method is presented in the paper.
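
The automated class labeling described in the abstract above can be sketched on a toy graph: unconnected concept pairs in the earlier snapshot become candidates, and a pair is labeled positive if it is linked in the later snapshot. This is a minimal single-machine sketch, not the paper's Map-Reduce pipeline; the concept names, the two neighborhood features shown, and all function names are assumptions for illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def neighborhood_features(adj, u, v):
    """Two simple neighborhood features for a candidate pair."""
    nu, nv = adj[u], adj[v]
    common = len(nu & nv)
    union = len(nu | nv)
    return {"common_neighbors": common,
            "jaccard": common / union if union else 0.0}


def labeled_examples(edges_t1, edges_t2):
    """Label pairs unlinked at t1 by whether they are linked at t2."""
    adj1 = build_adjacency(edges_t1)
    later = {frozenset(e) for e in edges_t2}
    examples = []
    for u, v in combinations(sorted(adj1), 2):
        if v in adj1[u]:
            continue  # already linked in the earlier snapshot
        label = int(frozenset((u, v)) in later)
        examples.append(((u, v), neighborhood_features(adj1, u, v), label))
    return examples


# Made-up concept network: snapshot t2 adds one new link, A-C.
edges_t1 = [("A", "B"), ("B", "C"), ("C", "D")]
edges_t2 = edges_t1 + [("A", "C")]
for pair, feats, label in labeled_examples(edges_t1, edges_t2):
    print(pair, feats, label)
```

The resulting feature dictionaries and labels could then be passed to any standard classifier; which classifier the paper uses is not stated in the abstract.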